CalibreKindleCollections.py
===========================
A simple script for handling collections on the Amazon Kindle.

The Kindle is a great reading device, but so far it has been rather painful to use with a large collection of documents,
for example when dealing with many research papers.

This script builds Kindle collections from the metadata.calibre file that Calibre sends to the Kindle.
 
As Amazon doesn't seem to support this functionality yet, this is a hack, and it requires the Kindle to be restarted before the changes take effect.
This is quite annoying, but there doesn't seem to be another way until Amazon releases its `Kindle Development Kit`,
at which point this script will hopefully become obsolete.

Usage

    $ python2.6 CalibreKindleCollections.py [options]

Use the line below to list the available options:

    $ python2.6 CalibreKindleCollections.py -h

Note that if the script is saved in the root folder of the Kindle (e.g. /Volumes/Kindle), the mount point does not need to be specified.

Please feel free to use and extend the script in any way you want. If you already have a valued set of collections,
it is a good idea to back up the Kindle's collections.json file before you give this a try.


What this script does
=====================
Authors: A collection named "<prefixAuthor><authorname>" will be created for each author, provided that:
    a) "--na" is NOT specified on the command line
    b) The author's collection would contain at least minBooksForAuthor books

Tags: A collection named "<prefixTag><tagname>" will be created for each tag, provided that:
    a) "--nt" is NOT specified on the command line
    b) The tag starts with the string given in collectionTagPrefix (other tags never produce collections)
    c) The tag's collection would contain at least minBooksForTag books
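
As a rough illustration of the tag rules above (this is a sketch, not the script's actual code), the snippet below groups books by tag, ignores tags that lack the configured prefix and drops groups that are too small. The function name, the book dictionaries and the default values are hypothetical; the parameters mirror collectionTagPrefix, prefixTag and minBooksForTag. It also assumes the tag is used unchanged as <tagname>.

```python
from collections import defaultdict

def build_tag_collections(books, collection_tag_prefix="-", prefix_tag="- ",
                          min_books_for_tag=1):
    """Group book titles by tag, applying the prefix and minimum-size rules."""
    by_tag = defaultdict(list)
    for book in books:
        for tag in book.get("tags", []):
            # Rule b: only tags starting with the prefix produce collections.
            if tag.startswith(collection_tag_prefix):
                by_tag[tag].append(book["title"])
    collections = {}
    for tag, titles in by_tag.items():
        # Rule c: skip collections that would be too small.
        if len(titles) >= min_books_for_tag:
            collections[prefix_tag + tag] = sorted(titles)
    return collections

# Hypothetical sample data, for illustration only.
sample_books = [
    {"title": "Paper A", "tags": ["-physics", "unread"]},
    {"title": "Paper B", "tags": ["-physics"]},
    {"title": "Novel C", "tags": ["-fiction"]},
]
tag_collections = build_tag_collections(sample_books, min_books_for_tag=2)
```

With a minimum of 2, only the "-physics" tag yields a collection; "unread" lacks the prefix and "-fiction" has a single book.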

Series: A collection named "<prefixSeries><seriesname> - <authorname>" will be created for each series, provided that:
    a) "--ns" is NOT specified on the command line
    b) The series's collection would contain at least minBooksForSeries books
    c) Note that the <authorname> is the name of one random author of book(s) in the series (and once chosen will persist across updates to collections)
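
One plausible way to get the "once chosen will persist" behaviour from point c) is to look for an existing collection whose name already starts with the series stem and reuse it. The sketch below assumes exactly that; the function name and default prefix are hypothetical, and it picks the alphabetically first author as a deterministic stand-in for the script's random choice.

```python
def series_collection_name(series, authors, existing_names, prefix_series="- "):
    """Return the collection name for a series, reusing an earlier choice if any."""
    stem = prefix_series + series + " - "
    for name in sorted(existing_names):
        if name.startswith(stem):
            return name  # keep the author chosen on a previous run
    # No earlier collection found: pick one author (stand-in for a random pick).
    return stem + sorted(authors)[0]

# Hypothetical sample data, for illustration only.
existing = ["- Foundation - Isaac Asimov"]
kept_name = series_collection_name("Foundation", ["Someone Else"], existing)
new_name = series_collection_name("Dune", ["Frank Herbert", "Brian Herbert"], existing)
```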

Miscellany: Any files on the Kindle which are not in a collection when this script has run are automatically placed in the collection
miscellanyCollectionName, which will be created if necessary. Additional by-genre miscellany collections may also be defined in the script.
This behaviour may be switched off by specifying the command-line option "--nm".
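
The miscellany rule can be sketched as follows. This is only an illustration under simple assumptions: collections are modelled as a dictionary mapping collection names to lists of book ids, and the function name and the default collection name are hypothetical.

```python
def add_miscellany(collections, all_book_ids, miscellany_name="- Miscellany"):
    """Put every book id that is in no collection into the miscellany collection."""
    assigned = set()
    for items in collections.values():
        assigned.update(items)
    leftovers = [book_id for book_id in all_book_ids if book_id not in assigned]
    if leftovers:
        # Create the miscellany collection only if it is actually needed.
        collections.setdefault(miscellany_name, []).extend(leftovers)
    return collections

# Hypothetical sample data, for illustration only.
with_misc = add_miscellany({"- Authors": ["a", "b"]}, ["a", "b", "c", "d"])
unchanged = add_miscellany({"- Authors": ["a"]}, ["a"])
```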

By default, Kindle collections are updated: Additions may be made to existing collections and new collections created but pre-existing collections
will be left alone (unless they're empty, in which case see below). However, specifying the "--rb" command-line option will cause existing
collections to be wiped and completely replaced with new collections built based on the data from Calibre.
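
The difference between the default update mode and "--rb" can be sketched like this. It is a simplified model, not the script's code: collections are plain dictionaries of name to item list, and the function name is hypothetical.

```python
def merge_collections(existing, generated, rebuild=False):
    """Combine the Kindle's existing collections with newly generated ones."""
    if rebuild:
        # Rebuild mode: discard everything and keep only the generated data.
        return dict((name, list(items)) for name, items in generated.items())
    # Update mode: keep existing collections and only add to them.
    merged = dict((name, list(items)) for name, items in existing.items())
    for name, items in generated.items():
        target = merged.setdefault(name, [])
        for item in items:
            if item not in target:
                target.append(item)
    return merged

# Hypothetical sample data, for illustration only.
existing = {"Old": ["x"], "- A": ["a"]}
generated = {"- A": ["a", "b"], "- B": ["c"]}
updated = merge_collections(existing, generated)
rebuilt = merge_collections(existing, generated, rebuild=True)
```

In update mode the hand-made "Old" collection survives; in rebuild mode it is gone.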

If you specify the "--c" option on the command-line then this script will attempt to remove from collections any references to books which no longer
exist in the Kindle's file system. Note that this option should only be used if you have no books which contain embedded asin IDs in their metadata.
If you have such books (azw files, mobi files converted directly from azw files, or any other affected files) then you should NOT use this option, as it will
remove such books from collections (the books will still be on the Kindle, but will no longer be in any collections). See "Known Issues" below.
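
To see why books with embedded asin IDs get dropped, here is a sketch of the cleanup. It assumes the commonly reported scheme in which the Kindle's calculated id is "*" followed by the SHA-1 hash of the file's path under /mnt/us/; both that scheme and the function names are assumptions, and "#B000EXAMPLE^EBOK" is a made-up id.

```python
import hashlib

def calculated_id(relative_path, kindle_root="/mnt/us/"):
    """Assumed id scheme: '*' + SHA-1 of the file's path as seen by the Kindle."""
    full_path = kindle_root + relative_path
    return "*" + hashlib.sha1(full_path.encode("utf-8")).hexdigest()

def remove_dead_references(collections, files_on_kindle):
    """Keep only items whose id matches a file that is present on the Kindle."""
    live_ids = set(calculated_id(path) for path in files_on_kindle)
    cleaned = {}
    for name, items in collections.items():
        cleaned[name] = [item for item in items if item in live_ids]
    return cleaned

# Hypothetical sample data, for illustration only.
collections = {"- Papers": [calculated_id("documents/paper.pdf"),
                            calculated_id("documents/gone.pdf"),
                            "#B000EXAMPLE^EBOK"]}
cleaned = remove_dead_references(collections, ["documents/paper.pdf"])
```

The embedded id never matches a hash calculated from a path, so it is removed along with the reference to the deleted file.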

If you specify the "--e" option on the command-line then any collections which are empty after all operations have been carried out will be deleted.
The same caveats apply here as to the "--c" option, above.

By default, the script will produce a final report on the collections on the Kindle. This behaviour can be disabled using "-nr" on the command line.
See "Known Issues", below, for a note on how this may be crucial for some users.

If you are rebuilding all of your collections then this script automatically sets their lastAccess time to ensure that they will (initially at least)
show up sorted into alphabetical order on the kindle's home page in "collections" view. If you're updating collections then the lastAccess time
for your collections will simply be set to the current date/time on those collections which have been changed; or you can use the command-line
option "--sort" to re-sort ALL of the collections on your Kindle into alphabetical order. Note, though, that this WILL overwrite any existing
lastAccess data for all of your collections.
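
The sorting trick can be sketched as below. It assumes that the Kindle lists the most recently accessed collection first and that lastAccess is a timestamp in milliseconds; the function name and the step size are arbitrary choices for illustration.

```python
import time

def assign_alphabetical_last_access(collections, now_ms=None, step_ms=1000):
    """Give earlier names later timestamps so they are listed first."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    names = sorted(collections, key=lambda name: name.lower())
    for offset, name in enumerate(names):
        collections[name]["lastAccess"] = now_ms - offset * step_ms
    return collections

# Hypothetical sample data, for illustration only.
sample = {"b": {"items": []}, "A": {"items": []}, "c": {"items": []}}
assign_alphabetical_last_access(sample, now_ms=10000)
most_recent_first = sorted(sample, key=lambda name: -sample[name]["lastAccess"])
```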

Finally, the script can be instructed to produce fewer output messages using the command-line switch "--quiet" or to produce many more
messages using the switch "--verbose".


Running this Script
===================
The simplest way to run the script is to copy it to your Kindle's root folder (the one above the "documents" folder) and run it from there.
If you're using Windows, just download and install Python 2.6 or 2.7 and you can then run the script simply by double-clicking on it. Or,
if you wish to use the command-line options, you might want to create a batch file with a line like this in it:

	C:\Python27\python.exe "K:\CalibreKindleCollections.py" --rebuildcollections --noseries > CalibreKindleCollections_log.txt

That line will run the script with the options to recreate all collections from scratch, not to create collections based on series (so just
use authors and tags) and will send the output of the script (a description of the collections it's created) to a text file for later perusal.
Just change "C:\Python27\" to the folder where you installed Python, "K:" to the drive letter of your Kindle and set the command line
options you prefer.

Similarly, the command:

	C:\Python27\python.exe "K:\CalibreKindleCollections.py" --quiet --noupdatecollections --nocleanupdeadfiles --nocleanupemptycollections --nomiscellanycollection > CalibreKindleCollections.log

or, using the short forms of the command-line options:

	C:\Python27\python.exe "K:\CalibreKindleCollections.py" --q --nu --nc --ne --nm > CalibreKindleCollections.log

Either command simply produces a file, CalibreKindleCollections.log, listing the collections currently on the Kindle and their contents, but does not
change those collections in any way.


Some notes on how to use this script
====================================
Short version:

    1) Set up your tags, series, etc. as you want them in Calibre. Add a dash (-) to the start of any tags you want to use for collections.
    2) Make sure Calibre's "metadata management" setting is set to "automatic management" (the setting is in Preferences->Sending Books to Devices).
    3) Connect your Kindle and allow Calibre to detect it. Allow Calibre to send its metadata to your Kindle.
    4) Run the script, with command-line options if you wish.
    5) Restart your Kindle by pressing and holding the power button until it restarts or by going Home->Menu->Settings->Menu->Restart Kindle.
    6) Once your Kindle restarts it will take a few minutes to work out where to put everything, but once it's done you should have properly
        organised collections based on your metadata in Calibre. Collections created by this script are prefixed with "- " (a dash and a space),
        so that if you view your home page sorted by title then the collections will show first, in alphabetical order, followed by your individual
        books, also in alphabetical order by title.
    7) If you wish, you can change Calibre's metadata setting back to "Only on Send" or "Manual management" after it's sent the metadata to
        your Kindle. One good reason for doing this is that when "automatic management" is set, Calibre refuses to read collections
        info from the Kindle for display on the "Device" page in Calibre. The collections are still there and in use by the Kindle, but for some
        reason (I don't know why) Calibre won't read/display them.

If you have wireless switched on then it's possible that your kindle will retrieve your old collections from Amazon instead of using the
new ones. If that happens, repeat the above steps but switch wireless off before you run the script and don't switch it back on again until
after the Kindle restarts.

Note that the script uses *ONLY* the metadata which Calibre has sent to the Kindle about the books which are in Calibre's library *and* on the
Kindle. Thus, this script cannot manage files unless the file is in Calibre's library and the metadata from Calibre is up to date on the kindle.
Hence the need to ensure that Calibre updates its metadata on the Kindle before running the script.
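
To give an idea of what the script has to work with, the sketch below parses metadata.calibre content. It assumes the file is a JSON list of book records with fields such as "lpath", "title", "authors", "tags" and "series"; those field names and the function name are assumptions made for illustration, so check them against your own file.

```python
import json

def load_calibre_metadata(text):
    """Parse metadata.calibre content (assumed to be a JSON list of book records)."""
    books = []
    for record in json.loads(text):
        books.append({
            "path": record.get("lpath"),          # path relative to the Kindle root
            "title": record.get("title"),
            "authors": record.get("authors") or [],
            "tags": record.get("tags") or [],
            "series": record.get("series"),       # None when the book has no series
        })
    return books

# Hypothetical sample content, for illustration only.
sample_text = ('[{"lpath": "documents/paper.pdf", "title": "A Paper", '
               '"authors": ["A. Author"], "tags": ["-physics"], "series": null}]')
books = load_calibre_metadata(sample_text)
```

A book that is on the Kindle but missing from this list is invisible to the script, which is why the metadata has to be up to date before each run.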


Detailed version:

For the collections to be accurately built, it is essential that your kindle has the current metadata from Calibre. This will *ONLY* be the
case if Calibre's "metadata management" setting is set to "automatic management".

However, if that setting is in effect then Calibre is unable to read the collections data from the kindle for display in the "Device" page.
If this is a concern then the only solution is to:

    1) Set "metadata management" to "automatic management" in Calibre
    2) Disconnect your kindle if it's already connected. Or, alternatively, completely exit Calibre.
    3) Connect your kindle (or start up Calibre)
    4) Calibre should detect the Kindle and send it a sizeable package of metadata
    5) Once that metadata is sent then you can run this script and your collections will be created *based on the tags as they were when that
        packet was sent*
    6) Once the metadata has been sent, you can switch Calibre's metadata management option back to your preferred setting until the next time
        you wish to run this script

Once the script's run you need to restart your Kindle *immediately*. If you have SSH server enabled on your Kindle then rather than restarting
the kindle you can send the following command to your Kindle instead:

    ssh root@yo.ur.I.P 'dbus-send --system /default com.lab126.powerd.resuming int32:1' 

Search for the "usbnetwork" hack for details on how to set up an SSH server on your Kindle, but frankly I wouldn't recommend it just for this.


Known Issues
============
1) Some books have an embedded Amazon asin id somewhere in their metadata, and the Kindle uses that asin id in preference to a calculated one.
    Such books cannot be assigned to collections by this script (or, rather, they can be assigned but the kindle won't put them into collections).

    There are two possible workarounds. The simplest is to convert those books to another format, such as epub, and then convert them back to
    mobi, re-send them to the kindle and then rebuild your collections.

    If that's not possible, or desirable, then the alternative is to either manually move them to collections yourself on the kindle or to
    add such books, individually, as exceptions in the getAsin() function in this script, in the same way that the Kindle User Guide is handled
    by the code at present.

    Be aware, though, that to do this you will need to obtain the book's asin id AND that the collections-assignment of such books will not
    be visible in Calibre (they'll show up on the "device" page as not being in a collection).

    Also, if you do go this route then ensure that you *ALWAYS* run this script with the "--nc" command-line option (i.e. disable the removal
    from collections of books which could not be found on the Kindle file system), since such books cannot be located in the file system by
    their embedded asin id, and so would be auto-removed by this script if that option is not used to disable that action.

    One way to obtain the book's internal asin is to, on the kindle, create a new collection and manually add that specific book *and no others*
    to it. Then connect the kindle to your computer and open its collections.json file. Look for the newly-created collection and the id number of
    the sole item inside that collection is the asin id which the kindle uses for that book.

    Unless/Until some method is found for this script to automatically retrieve such values from the eBook in such cases, there's no way to
    automate that process, unfortunately.

2) If a book is moved out of a collection (but is still on the kindle and otherwise unchanged) and the script is running in update mode then it
    relies on the kindle to detect that change rather than doing it itself. This won't be changed, most likely, as such detection would require
    this script to retain a snapshot of the previous state of the collections and it's just not worth it.

    Similarly, if you set the script to create collections for authors who have at least 3 books and then later change that limit to 4 books, the
    previously-created collections will not be removed if you run the script in update mode. This is intentional: Update mode should never
    delete collections which have books inside them. Either manually delete those collections or run the script in rebuild mode.

3) Some Calibre metadata may contain Unicode characters which cannot be displayed by this script. If this happens then the script may throw an error
    when it attempts to produce output for such files. Disabling the final report with "-nr" may avoid this.
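
For the workaround in issue 1, the id of the single book in a temporary collection can be read from collections.json with a few lines of Python rather than by eye. This sketch assumes the commonly reported layout in which each key is the collection name followed by a locale suffix such as "@en-US" and each value holds an "items" list; the function name, the collection name "Temp" and the id "#B000EXAMPLE^EBOK" are made up for illustration.

```python
import json

def ids_in_collection(collections_json_text, collection_name):
    """Return the item ids stored for one collection, or [] if it is not found."""
    data = json.loads(collections_json_text)
    for key, value in data.items():
        # Strip the assumed locale suffix, e.g. "Temp@en-US" -> "Temp".
        if key.rsplit("@", 1)[0] == collection_name:
            return value.get("items", [])
    return []

# Hypothetical sample content, for illustration only.
sample_json = ('{"Temp@en-US": {"items": ["#B000EXAMPLE^EBOK"], '
               '"lastAccess": 1300000000000}}')
found_ids = ids_in_collection(sample_json, "Temp")
```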